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Abstract 

We present a Bayesian approach to the problem of determining parameters 

for coalescing binary systems observed with laser interferometric detectors. 

By applying a Markov Chain Monte Carlo (MCMC) algorithm, specifically 

the Gibbs sampler, we demonstrate the potential that MCMC techniques may 

hold for the computation of posterior distributions of parameters of the binary 

system that created the gravity radiation signal. We describe the use of the 

Gibbs sampler method, and present examples whereby signals are detected 

and analyzed from within noisy data. 
04.80.Nn, 02.70.Lq, 06.20.Dq 
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I. INTRODUCTION 



A number of collaborations around the world will be operating laser interferometric 
gravitation radiation antennas within the next few years. In the United States the Laser 
Interferometric Gravitational Wave Observatory (LIGO) is soon to be operational, with 4 
km arm length interferometers in Hanford, Washington, and Livingston, Louisiana ||T|. A 
similar French- Italian detector will be built in Europe (VIRGO) PS. 

Coalescing binaries containing neutron stars (NS) or black holes (BH) are likely to be the 
cleanest and most promising source of detectable radiation ||. Ultimately the LIGO- VIRGO 
network may observe binaries out to a distance of 2 Gpc f|. The detection of coalescing 
binary events will provide physicists with extremely useful cosmological information. Ini- 
tially Schutz |J noted that a detected signal contains enough information to decipher the 
absolute distance to the system, and hence the determination of the Hubble constant would 
be achieved through the observed distribution of several binaries. Subsequent work indi- 
cates that the uncertainty in the measured distance can be comparable to the distance itself, 
but important cosmological tests will still be possible through the observation of numerous 
mergers ||. 

In addition to the cosmological importance, accurate parameter estimation in the ob- 
served coalescing binaries will provide a host of information of great physical significance. 
Observation of the time of tidal disruption of an NS - NS binary system may permit a deter- 
mination of the NS radii and information on the NS equation of state ||. The characteristics 
of radiation in the post-Newtonian regime will provide insight into highly non-linear general 
relativistic effects |7| Jl^JTT|1 . The formation of a BH at the end of a NS-NS coalescence, or 
the merger of two BHs, will produce gravitational radiation as the system decays to a Kerr 
BH; this is an extremely interesting radiation production regime |T0| , pT| . 

Application of Bayes' theorem is well suited to astrophysical observations [Q. The 
Bayesian versus frequentist approaches to gravitational radiation data analysis are well pre- 
sented by [pi. Parameter estimation from the gravity wave signals of coalescing compact 
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binaries provides an important application of Bayesian methods 0,0,[L4],|T^] . Difficulties with 
the calculation of Bayesian posterior distributions have been overcome by the rapid devel- 
opment of Markov Chain Monte Carlo (MCMC) methods in the last decade (see [HJ for 
an introduction). Although the initial MCMC algorithm dates back to |T7|], the enormous 
potential that MCMC methods might hold for Bayesian posterior computations remained 
largely unrecognized within the statistical community until the seminal paper by Geman 



and Geman |L8[] in the context of digital image analysis. Since then, MCMC methods have 
had a huge impact on many areas of applied statistics. It has now become practical to apply 
Bayesian methods to complex problems. Thus, we expect a similar effect on gravitational 
wave data analysis. 

The initial goal of our research effort, presented in this paper, is to demonstrate the 
usefulness of MCMC techniques for estimating parameters from coalescing binary signals 



detected by laser interferometric antennas. The Gibbs sampler [[T(J is one of the simpler 
MCMC techniques, and we use it as a starting point for our investigation primarily because 
there is readily available software |19||. Our study of gravity wave signals is conducted to 



2.5 post-Newtonian (PN) order. The signals depend on five independent parameters; the 
masses of the two compact objects, the amplitude of the detected signal, the coalescence 
time and the phase of the signal at coalescence. The Bayesian techniques we employ will 
not only give point estimates of these parameters, but also produce their complete posterior 
probability distribution that can be employed to summarize the uncertainty of parameter 
estimates through posterior credibility intervals, for instance. In contrast to frequentist 
confidence intervals, these do not rely on large sample asymptotics and have a simple, 
natural interpretation. 

The paper is organized as follows: In Section II we briefly review Bayesian inference and 
describe the MCMC simulation technique we use, specifically the Gibbs sampler and software 
for its implementation. In Section III we present two examples where we use our MCMC 
approach to identify the parameters which created the signal that is buried in synthesized 
LIGO noise. In Section IV we analyze a number of issues that will effect the efficiency 



and calculational time of a MCMC approach to the coalescing binary parameter estimation 
problem. We conclude with a discussion of our results and the direction of future efforts in 
Section V. 



II. B AYES IAN INFERENCE AND POSTERIOR COMPUTATION 

We briefly review the Bayesian approach to parameter estimation. Let us assume the 
data consists of n observations, z = (zi, . . . , z n ), with joint PDF denoted by p(z\0) condi- 
tional on unobserved parameters = (#1, . . . , 9 d ). The PDF p(z\0) is usually referred to 
as the likelihood and regarded as a function of 0. In contrast to the frequentist approach 
where is regarded as fixed but unknown, the Bayesian approach treats as a random 
variable with a probability distribution that reflects the researcher's uncertainty about the 
parameters. Bayesian inference requires the specification of a prior PDF for 0, p(0), that 
should take all information into account that is known about before observing the data. 
All information about that stems from the experiment should be contained in the likeli- 
hood. Bayesian inference then answers the question: "How should the data z change the 
researcher's knowledge about 0T Via an application of Bayes' theorem, by conditioning on 
the known observations, this post-experimental knowledge about is expressed through the 
posterior PDF 



malizing constant as it is independent of 0. The posterior PDF is thus proportional to the 
product of prior and likelihood. 

The standard Bayesian point estimate of a single parameter, say 9i, is the posterior mean 



p(0\z) 



m(z) 



ocp(0)p(z\0) 



(1) 



where m(z) = J p(z\0)p(0)d0 is the marginal PDF of z which can be regarded as a nor- 




(2) 



where 
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P (9i\z) = J ... J p{o\z)de l ...de i „ 1 de i+l ...do d . (3) 

is the marginal posterior PDF obtained by integrating the joint posterior PDF over all other 
components of 6 except 9{. A measure of the uncertainty of this estimate is the posterior 
standard deviation or a 95% credibility interval that contains the parameter 9i with 95% 
probability, its lower and upper bound being specified by the 2.5% and 97.5% percentile 
of p(9i\z), respectively. Alternatives to the posterior mean are the posterior mode, aka 
maximum a posteriori estimate (MAP), and the more robust posterior median. 

As seen from equations (2) and (3), the calculation of posterior means requires d- 
dimensional integration, one of the main issues that has made the application of Bayesian 
inference so difficult in the past. This hurdle has been overcome by the great advances in 
simulation-based integration techniques, so called MCMC methods plpPf . In MCMC, a 
Markov chain is constructed with the joint posterior as its equilibrium distribution. Thus, 
after running the Markov chain for a certain "burn-in" period, one obtains (correlated) 
samples from the limiting distribution, provided that the Markov chain has reached conver- 
gence. One popular construction principle is the Gibbs sampler, a specific MCMC method 
that samples iteratively from each of the univariate full conditional posterior distributions 

p(9 i \z,9 1 ,...,9^ 1 ,9 i+1 ,...,9 d ). (4) 

Given an arbitrary set of starting values 9^ , . . . , 9 d the algorithm proceeds as follows: 

simulate 9{ 1] ~ p{6 1 \x, 9^\..., 9 { d ] ) 

simulate 9^ ~ p(9 2 \z, 9? , 9 { ° ] , ...,9 { d ] ) (5) 
simulate 9 d 1] ~ p(6 d \z, 9? 9^) 



and yields 0^ = (#[ m \ . . . ,9 d m ^) after m such cycles. This defines a Markov chain that 
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converges to the joint posterior as its equilibrium distribution |]16| . Consequently, if all 



the full conditional posterior distributions are available, all that is required is sampling 
iteratively from these. Thereby, the problem of sampling from an ci-variate PDF is reduced 
to sampling from d univariate PDFs. 

In many applications where the prior PDF is conjugate to the likelihood, the full con- 
ditionals in fact reduce analytically to closed-form PDFs and we can use highly efficient 
special purpose Monte Carlo methods for generating from these (see e.g. HHI1 ). In general, 
however, we need a fast and efficient black-box method to sample from an arbitrarily com- 
plex full conditional posterior distribution in each cyclic step of the Gibbs sampler. Such an 
all-purpose algorithm, so-called adaptive rejection sampling (ARS) was developed by Gilks 
and Wild [^] for the rich class of distributions with log-concave densities. We can use a 
recently developed "Metropolized" version of adaptive rejection sampling (ARMS) for non- 
logconcave distributions []17p3| . C-subroutines of ARS and ARMS are available 0] and 
can thus be tailored to the full conditional posteriors of the problem at hand. 

Significant progress has been made in facilitating the routine implementation of the 
Gibbs sampler with the help of BUGS (Bayesian Inference Using Gibbs Sampling), a recently 
developed software package by the Medical Research Council Biostatistics Unit, Institute 
of Public Health, Cambridge, England. BUGS samples from the joint posterior distribution 
by using the Gibbs sampler. For reviews on BUGS the reader is referred to [[24H 26I. BUGS 
is available free of charge from 

\protect\vrule widthOpt\protect\href {http : / / www . mrc-bsu . cam . ac . uk/bugs/Welcome . htmlMhtt 

BUGS can handle the two main tasks necessary for implementation of the Gibbs sampler. 
These tasks are to i) construct and ii) to sample from the full conditional posterior densities. 
Only the prior and sampling distributions for unobservables and observables, respectively, 
have to be specified in a BUGS program. The tedious task of constructing the full condition- 
als is automated by BUGS using directed acyclic graphs J^J . Sophisticated routines such as 
adaptive rejection sampling to sample from log-concave full conditionals and MH algorithms 
based on slice sampling to sample from non-logconcave full conditional densities have been 
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implemented in BUGS and are continuously being refined. Furthermore, various methods 
to assess convergence, i.e. methods used for establishing whether an MCMC algorithm has 
converged and whether its output can be regarded as samples from the target distribution of 
the Markov chain, have been developed and implemented in CODA [p^j . CODA is a menu- 
driven collection of SPLUS functions for analyzing the output obtained by BUGS. Besides 
trace plots and the usual tests for convergence, CODA calculates statistical summaries of 
the posterior distributions and kernel density estimates. CODA is being maintained and 
distributed by the same research group responsible for BUGS. 

III. EXAMPLES OF COALESCING BINARY SIGNALS DETECTED BY A 

LASER INTERFEROMETER 



Our initial goal, presented in this paper, is to demonstrate the usefulness of MCMC 
techniques for estimating parameters from coalescing binary signals detected by laser in- 
terferometric antennas. We generated the coalescing binary gravity wave signals to 2.5 PN 
order in both the time |29| , |30| and frequency domains. Noise that simulates the LIGO 
II environment was synthesized in the time domain using software from Finn and Daw j32 



The power spectral density of the noise was calculated from long-time noise signals gen- 
erated by |p2|| . The 2.5 PN frequency domain signals also served as the templates for our 
extraction of the event parameters. 

The total output registered by the detector, z (t), is the sum of the gravity wave signal, 
s(t, 6), that depends on unknown parameters 8, and the noise n(t), namely z (t) = s (t, 0) + 
n (t) for t G [0, t u ]. We assume that the noise is Gaussian with mean zero and known 
one-sided power spectral density S n (f). The signal-to-noise ratio (SNR) of the detected 



signal is SNR = y 2 8 ^P*ff df with s (/) = s (t) e 2m ^ l dt the Fourier transform of 
the function sit) ||. The likelihood is given by 

p(z|0) = Kexp [2 (z, s(0)) - (s(0), s(0)>] (6) 

where K is a constant and (a, b) = df ^^jjfi- the inner product of two functions a, b ||. 
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Since z(f) = z*(—f) for real signals, we can express the likelihood as 

,Re{(z(f)-s(f,O))(z(f)-s(f,0))*} 



p(z\6) = K exp 



-2 / df- 
'o 



Sn(f) 



(7) 



For discretized data the likelihood takes the form 

Re {z{i * Af) - §{i * Af, 0)) {z{i * Af) - s(i * Af, d))* 



p(z\6) = K * exp 



t=H 



S n (i * A/) 



(8) 



where %i * Af and i u * Af correspond to the lower and upper limits of the frequency range 
examined and Af is the resolution of the discretized domain data. 

As detailed in pi] , the signal depends on five parameters, namely the masses 777,1,777,2 
of the two compact objects, the coalescence time t c , the phase of the wave <po, and the 
amplitude N through 



5^ 0) = jy e ^o y-7/6 e i(V>(/)+27r/t c ) 



with 6 = (mi, m 2 , t c , (po, N) and 



i=l 



where 



(9) 



(10) 



a 2 
a 3 

a 5 



-g- 5/3 , 



12877- 

1 /3715 rr \ _ x 
128?7 y 

3 / 15293365 27145 3085 „ 1 _ ., 

-77 H 77 5 7,3 



1287/ V 508032 ' 504 

7T / 38645 \ 
1287/ V 252 + 7 ' 



+ 



72 



(11) 



and ft = r 5 / 3 , &(/) = r 1 , &(/) = /- 2/3 , = /" 1/3 , = ln(/), the total mass 

m t = m! + m 2 , q = TcGm t /c 3 , and the mass ratio 77 = mim 2 /mf. 
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The templates for the gravity wave signal, s(f, 0), were always generated in the frequency 
domain to 2.5PN order according to the formulas given in because the entire likelihood 
computation is also done entirely in the frequency domain. Examples of our Gibbs sampler 
code for use with BUGS can be obtained electronically |33|j . 



A. Results for Signals F in the Frequency Domain 

The published 2.5 PN order time domain templates are not quite the exact Fourier 
transform of the 2.5 PN frequency domain templates, hence we felt the necessity to test our 
method with signals we generated in both time or frequency space. For signals generated in 
the frequency domain we used published 2.5 PN order results For this example we chose 



the masses of the compact objects to be m x = 1.4M and m 2 = 3.5M , the coalescence time 
was t c = lms, and the phase of the wave of <£>o = 0.123. The amplitude of the wave was 
adjusted to create appropriate SNR values. The noise was generated in the time domain 
|32|| , and subsequently transformed to the frequency domain. 

We assumed noninformative a priori distributions for all of our parameters; namely a 
uniform distribution on 0.3M Q to 12M & for each of the two compact objects {mi and m 2 ) 
and a uniform distribution on to 50ms for t c . The gravity wave phase will lie between and 
— 7i to 7T for (f , but we assume a uniform a priori distribution between — 2tt to 2tt so that the 
converged chain can more easily sample its region of interest. The amplitude of the incoming 
gravity wave has a dependence of h (f) = Nrj^-^m^ 6 /~ 7//6 . Our simulated signal thereby 
has m t = 4.9M Q and t] = 0.2041. We used a uniform a priori for the amplitude term N on 
the interval — 10~ 19 to 10~ 19 . This was subsequently renormalized with a factor of 10 25 for 
computational reasons yielding a uniform a priori distribution for N on the interval — 10 6 to 
10 6 . We assumed that the compact objects had no spin, and hence these parameters were 
not included in this study. 

Unique gravity wave signals do not depend on mi and m-2, but instead on the total mass 
m t and the mass ratio m Hence it is the posterior probability distribution functions of m t 
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and 7] that provide the best information about the system. The posterior distributions of mi 
and m 2 from the MCMC are unnecessarily wide due to oscillations of each chain between 
the two possible values. Estimates and statistical properties of mi and wi2 must be inferred 
from the distributions for rnt and r\. 

In our initial implementation of the Gibbs sampler it was found that it takes a pro- 
hibitively long time for the chain to burn in and sample from the correct posterior distri- 
bution. Instead, an efficient procedure that allowed the chain to more efficiently explore 
the phase space was one that is analogous to simulated annealing |34]. In this procedure 



we use a likelihood of the form L = K exp [2 (z, s) — (s,s)] with (z,s) = df Uj , 
(s,s) = df ^ . The auxiliary variable T is a pseudo-temperature. If T is chosen 
large, T >> 1, we have heating and the MCMC does not get trapped in particular regions of 
phase space for too long. It essentially corresponds to increasing the variance of the poste- 
rior distribution to allow for wider jumps. Thus, the chain can reach all regions of the state 
space. In our study we typically start with T~500 and allow the chain to "burn-in" and 
find equilibrium. For each value of T the mean values for the parameters are computed and 
used as the starting values for the next chain with its reduced value of T. The T term can 
be quickly brought to T = 1 and the final kernel densities generated. Combining simulated 
annealing with MCMC samplers has been demonstrated to improve the efficiency of chains 



that mix poorly in their phase space 35 



When we analyzed the data in our MCMC program we had a frequency resolution of 
Af = 1Hz and only utilized the frequency range 30Hz — 730Hz; this range was based on 
reasonable assumptions of LIGO performance. Reducing the upper frequency limit does 
not greatly effect the performance of the MCMC results; this makes sense because for a 
coalescing binary signal most of the power of the signal is at the lower frequencies. The 
largest amplitude signal may happen at a high frequency, but the binary emits more cycles 
at lower frequencies. Although we have not studied this issue in depth, it appears that good 
MCMC performance will result even if one only includes frequencies up to ~200Hz. 

In Figs. 1-3 we present the results of this part of our investigation. We see that for large 
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SNRs we can accurately predict the parameters producing the signal. At the lower SNRs 
it takes the chain much longer to burn-in and find the correct parameters, and when the 
SNR is too low the chain will not converge and no useful information will be discerned. 
The results for SNR = 4.4 are further illustrated in Fig. 4, where we can observe the kernel 
densities for the parameters generated from the Gibbs sampler. 

Fig. 5 displays the operation of the Gibbs sampler, whereby the trace plots for the 
parameters are displayed. This is the result for SNR = 4.4 when we have set the annealing 
temperature to T = 100. One can see how the chain burns-in and achieves convergence. We 
would let the chain run for ~ 10,000 iterations, compute the mean values for the parameters 
(excluding points prior to burn-in), and use the calculated mean parameter values as initial 
conditions for a new Markov chain with a diminished annealing temperature. Once we have 
a T = 1 then 10,000 iterations typically produce an adequate and informative kernel density. 

Extensive convergence diagnostics were calculated for all of the parameters using the 



CODA software |zq| . All chains passed the Heideberger - Welch stationary test. The Raferty 
- Lewis convergence diagnostics confirmed the thinning and burn-in were sufficient. Lags 
and autocorrelations within each chain were reasonably low. Geweke Z-scores were low for 
all parameters. These convergence diagnostics are described in |23| (and references therein). 

Continuing the use of the SNR = 4.4 example, we can decipher the masses of the in- 
dividual compact objects. The result of the Gibbs sampler can give us estimates of these 
parameters by using simple summary statistics like the sample average and empirical per- 
centiles. These yield posterior means of m t = 4.891M and r] = 0.2048, plus 2.5 to 97.5 
percentile ranges of m t = (4.882 to 4.898) M & and rj = 0.2043 to 0.2055. This then implies, 
compact object masses of m x = (3.485 ± 0.01) M and m 2 = (1.406 ± 0.008) M . 

B. Results for Signals Generated in the Time Domain 

When we generated signals in the time domain the parameters were also chosen arbi- 
trarily. The generated signals were made to 2.5 PN fl2"9"|j3"0|] . The masses of the two compact 
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object were 8M and 9M , the angle of inclination of the orbit was i = tt/4, and the ori- 
entation of the gravity wave source with respect to the laser interferometer was <p = 2.2, 
9 = 1.1, and ip = 3.3 (all in radians). In order to adjust the signal to noise ratio we effec- 
tively changed the source to detector distance. The interferometer noise was also computer 
generated [52|, and the signal and noise were summed together. We created signals of 32s 
length, with 16384 data points per second. The temporal signal was Fourier transformed to 
the frequency domain. 

Templates for the likelihood were again 2.5 PN | 3lfl . The results below were based 
on a MCMC investigation that ranged from 30Hz — 130Hz, with frequency resolution of 
Af = 1Hz. Increasing the upper frequency did not affect the results, but only slowed the 
calculation. This is again consistent with the fact that most signal power is at the lower 
frequency. The choice of an upper frequency for the MCMC will effect the speed of the 
calculation and ultimately its ability to accurately estimate parameters; this is a topic we 
are currently investigating and will be the subject of a future publication. 

In Figs. 6-8 we present the results of this part of our investigation. We see that for large 
SNRs we can again accurately predict the parameters producing the signal. The results 
for SNR = 4.4 are presented in Fig. 9, where we can observe the kernel densities for the 
parameters that we generated from the Gibbs sampler. Fig. 10 displays the operation of the 
Gibbs sampler, whereby the trace plots for the parameters are displayed. This is the result 
for SNR = 4.4 when we have set the annealing temperature to T = 100. One can see how 
the chain burns-in and converges. 



IV. ANALYSIS ISSUES 

There are a host of topics that need to be addressed before one could say that MCMC 
techniques will be truly applicable and useful with LIGO data. However, we have demon- 
strated that the Gibbs sampler does have potential usefulness, and can successfully find the 
signal and make statistical statements about the parameters. Numerous issues pertaining 
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to MCMC use with LIGO data are discussed below. 

The speed at which the MCMC calculation runs on computers is a concern of paramount 
importance. Numerous issues influence the speed of the calculation. For the study presented 
in this paper, we would generate 32s of data in the time domain. The data corresponded to 
a sampling rate of lQ38AHz. Due to the character of the signal source and LIGO noise we 
only considered signal frequencies above 30Hz. 

The choice of an upper frequency limit can significantly influence the speed of the MCMC 
program. However, one can not arbitrarily reduce the upper frequency too much or the 
ability to estimate parameters will degrade. If our templates only consider the orbital 
parameters of the binary system, and not the ringdown of the newly formed black hole then 
the largest useful frequency will correspond to twice the instantaneous orbital frequency of 
the last stable orbit before free-fall, / = c 3 / {Q^^nGm^j . In this study we only considered 
compact objects with masses between O.3M and 1OM . The frequency could therefore 
range from as large 7300Hz for two O.3M objects, to 1570Hz for two 1.4M neutron stars, 
to TlOHz for two 1OM black holes. One should also remember that the Fourier transform 
of the signal falls off like \h (f)\ oc /~ 7 / 6 . Consequently the power of the signal is dominated 
by low frequencies; the binary spends more time emitting relatively low amplitude gravity 
waves at lower frequency as opposed to a shorter time producing larger amplitude signals 
at higher frequencies. Establishing an effective upper frequency choice will depend on the 
decision for the a priori distribution of the masses. It will also depend on analyses that 
investigate the behavior of the MCMC as the upper frequency varies. We anticipate that 
effective parameter estimation studies of events will vary the upper frequency and search 
for a convergence in behavior. As the upper frequency of the data is diminished the speed 
of the calculation increases. 

The frequency resolution of the data is another aspect that will affect program speed and 
parameter estimation ability. For example, with 32s of data the Fourier transform points are 
separated by 1/32 Hz. We found this precise frequency resolution to be unnecessary. We 
increased the program speed without diminishing parameter estimation ability by decreasing 
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the resolution to 1Hz. All the results presented in this paper utilized a frequency resolution 
of 1Hz. However, we feel that a detailed study of the coupling of frequency resolution with 
computational speed and effectiveness will be necessary. 

The simulated annealing procedure we used to adequately sample the parameter space 
is another topic where studies will be needed in order to optimize the speed of the routine. 
For the variables we used in our program, either in terms of the compact masses mi and 
rri2 or the total mass m t and the mass ratio t], it was necessary to introduce an pseudo- 
temperature T into the likelihood to initially burn-in the chain and converge to the correct 
parameters. Without T it would take a prohibitively long time for the chain to converge; 
the incorporation of simulated annealing into MCMC techniques has been demonstrated to 
decrease the burn-in time for the chain ||35j . In the studies presented here we would typically 
start with T~500, and it could take anywhere from 10 3 to 2 x 10 4 cycles for the chain to 
reach convergence. Smaller SNR events took longer to burn-in. The chain would run for 
about 10 4 after burn-in, and the mean values for the parameters were computed and used 
as the starting values for the next chain with its reduced value of T. The value of T would 
be decreased by an order of magnitude, so that a typical procedure would involve runs with 
= 500, 50, 5 and then 1. This cooling schedule is definitely not optimized for speed and 



efficiency The use of better coordinates |36j may provide a more well behaved parameter 
space, which in turn could help the chain mix more efficiently and reduce the time needed 
for simulated annealing. This will be investigated. 

All of the calculations presented in this paper were conducted on a 500 MHz pc. When 
the frequency span of the study extended from 30Hz to 730Hz, with 1Hz resolution, 10 4 
cycles of the MCMC took 1.5 hours. We often needed about 2.5 x 10 4 cycles to generate 
good kernel densities. When the frequency span extended from 30Hz to 130Hz, with 1Hz 
resolution, 10 4 cycles of the MCMC typically took approximately 10 minutes. 
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V. DISCUSSION 



In this paper we have demonstrated that the Gibbs sampler can be used to estimate the 
parameters for a coalescing binary system for signals detected by LIGO antennas. While we 
have not yet optimized this procedure for speed, we have shown that within just a few hours 
of running time on a 500MHz pc we can generate kernel densities for the parameters. The 
MCMC can replace a systematic march through a grid of templates in parameter space, and 
instead make a probabilistic random walk that is helped by the weighting of the product of 
the likelihood and the a priori distributions of the parameters. 

We have only concentrated on using our MCMC procedure for estimating signals in data 
sets where a signal is assumed to exist. Using MCMC techniques as a method to find signals 
from the continuous output of the LIGO detectors is another research topic that is not 
covered here. A question that must be answered is how efficiently can the MCMC method 
identify signals, and how often does it miss them. 

A great advantage of MCMC methods is that the calculational time does not scale 
exponentially with parameter number, but in fact scales almost linearly. Applications of 
state-space modelling in finance, such as stochastic volatility models applied to time series 
of daily exchange rates or returns of stock exchange indices, easily have 1000 - 5000 param- 
eters; specially tailored MCMC algorithms can effectively and efficiently sample the phase 
space |27| , |37H 39H . The number of parameters for coalescing binary signals will grow when, 
for example, the spin of the compact objects is included. We will also eventually expand 
our MCMC study to the problem of examining the signals detected by two or more inter- 
ferometers simultaneously. In addition to the parameters for pertaining to the coalescing 
binary, there will be the delay in arrival times between the detectors and the polarization 
sensitivities that can then infer the location in the sky of the source pO . MCMC methods 
offer great promise for parameter estimation with coalescing binary signals, especially as 
parameter numbers increase. 
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Figure Captions 

Fig. 1 Estimate of the total mass, m t , versus SNR. The actual total mass for the 
signal is m t = 4.9M . Error bars correspond to the 2.5 and 97.5 percentile of the posterior 
distribution of m t (which gives a 95% posterior credibility interval for m t ). 

Fig. 2 Estimate of the mass ratio, 77, versus SNR. The actual mass ratio for the signal is 
rj = 0.2041. Error bars correspond to the 2.5 and 97.5 percentile of the posterior distribution 
of rj (which gives a 95% posterior credibility interval for 77). 

Fig. 3 Estimate of the (rescaled) amplitude of the gravity wave, N, versus SNR. The 
amplitude of the inferred signal varies linearly with the SNR, as it should. Error bars 
correspond to the 2.5 and 97.5 percentile of the posterior distribution of iV (which gives a 
95% posterior credibility interval for N). 

Fig. 4 The kernel densities for the parameters (m t , 77, N, t c , and ipo), generated from 
the Gibbs sampler with SNR = 4.4. 

Fig. 5 Example of the operation of the Gibbs sampler, with trace plots for the parameters 
displayed, with SNR = 4.4 and the simulated annealing pseudo-temperature of T = 100. 
One can see how the chain burns-in and achieves convergence. 

Fig. 6 Estimate of the total mass, m t) versus SNR. The actual total mass for the 
signal is m t = 17M . Error bars correspond to the 2.5 and 97.5 percentile of the posterior 
distribution of m t (which gives a 95% posterior credibility interval for m t ). 

Fig. 7 Estimate of the mass ratio, 77, versus SNR. The actual mass ratio for the signal is 
77 = 0.2491. Error bars correspond to the 2.5 and 97.5 percentile of the posterior distribution 
of 77 (which gives a 95% posterior credibility interval for 77) . 

Fig. 8 Estimate of the (rescaled) amplitude of the gravity wave, N, versus SNR. The 
amplitude of the inferred signal varies linearly with the SNR, as it should. Error bars 
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correspond to the 2.5 and 97.5 percentile of the posterior distribution of N (which gives a 
95% posterior credibility interval for N). 

Fig. 9 The kernel densities for the parameters (m t , rj, N, t c , and </? ), generated from 
the Gibbs sampler with SNR = 4.4. 

Fig. 10 Example of the operation of the Gibbs sampler, with trace plots for the pa- 
rameters displayed, with SNR = 4.4 and the simulated annealing pseudo-temperature of 
T = 100. One can see how the chain burns-in and achieves convergence. 
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